library(mixOmics)
library(tidyverse)
library(SummarizedExperiment)  # assumed source of assays(), which is called below

Feature-level integration of data from three time points

Feature-level integration allows an integrated analysis of the same features measured in different samples against the same outcome variable. Its advantage is that the features do not have to come from the same samples. This suits the present study, because the full dataset can be used wherever phenotype annotation is available.

Here we use the MINT approach from the “mixOmics” package for an integrated analysis of the protein data from the three time points, in order to capture features from all three time points and their joint correlation with the outcome variable.
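
As a minimal sketch of the data layout this kind of integration expects, the toy example below stacks three time-point blocks that share the same features but not the same samples. All names and sizes (`proteins`, `make_block`, the block dimensions) are hypothetical and are not taken from the study data.

```r
# Hypothetical toy data: three time points measuring the same 4 proteins,
# but on different numbers of samples (names and sizes are illustrative only).
set.seed(1)
proteins <- paste0("prot", 1:4)
make_block <- function(n) {
  matrix(rnorm(n * length(proteins)), nrow = n,
         dimnames = list(NULL, proteins))
}
blocks <- list("24h" = make_block(5), "48h" = make_block(4), "72h" = make_block(6))

# Features (columns) must match across blocks before stacking.
stopifnot(all(sapply(blocks, function(b) identical(colnames(b), proteins))))

# Stack samples row-wise into one matrix with samples in rows.
X <- do.call(rbind, blocks)

# One label per sample row, recording which time point it came from.
study <- factor(rep(names(blocks), times = sapply(blocks, nrow)),
                levels = names(blocks))

dim(X)        # 15 samples by 4 proteins
table(study)  # number of samples contributed by each time point
```

The resulting `X` and `study` play the same role as the `X` and `study` arguments passed to `mint.plsda()` later in this report.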

Some factors to consider

  • The samples come from the same individuals at three time points, so the results may be confounded by the repeated measurements, and overfitting is possible
  • The purpose of this analysis is to extract the most important features, not to cross-validate a classifier
  • Classification is not validated on an independent test set

Notes on feature selection and PCs

To select the PCs that capture most of the variation in the phenotype variable, we ran an iterative search for the optimal number of PCs and of the features associated with those PCs. The selected features can then be used to assess how well they classify the phenotype variables.

In the first step we tune the number of PCs that capture most of the variation in the outcome variable. As in the following example, the first PC captures most of the biological variation.

plot(mint_res$perf.radc.pcs, col = color.mixo(5:7))
Fig 1: Number of PCs capturing maximum variation in the data

After the PCs are selected, an iterative selection using the leave-one-out procedure is performed to find the optimal number of features. These features are then used to assess the classification error.
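
The two tuning steps described above can be sketched as follows. This is an illustration under stated assumptions, not the code used to produce the results in this report: it runs on the `stemcells` example data shipped with mixOmics, fixes the number of components at 2, and searches an arbitrary small grid of `keepX` values. For MINT models, mixOmics cross-validates by leaving out one study at a time (in this report, that would be one time point), which is our reading of the leave-one-out procedure mentioned above.

```r
# Sketch only: uses the stemcells example data shipped with mixOmics,
# not the protein data analysed in this report. Skipped if mixOmics is absent.
if (requireNamespace("mixOmics", quietly = TRUE)) {
  library(mixOmics)
  data(stemcells)
  X <- stemcells$gene        # samples in rows, features in columns
  Y <- stemcells$celltype    # outcome classes
  study <- stemcells$study   # grouping factor (here: study of origin)

  # Step 1: fit with a generous number of components, then estimate the
  # error rate per component by leaving out one study at a time.
  fit_full <- mint.plsda(X = X, Y = Y, study = study, ncomp = 5)
  perf_res <- perf(fit_full)
  plot(perf_res)             # error rate against number of components

  # Step 2: with the number of components fixed (2 is an assumption here),
  # search a small, arbitrary grid for the number of features to keep.
  tune_res <- tune(X = X, Y = Y, study = study, ncomp = 2,
                   test.keepX = c(5, 10, 25, 50),
                   method = "mint.splsda",
                   measure = "BER", dist = "centroids.dist")
  keep_x <- tune_res$choice.keepX

  # Step 3: refit the sparse model and list the selected features.
  fit_sparse <- mint.splsda(X = X, Y = Y, study = study,
                            ncomp = 2, keepX = keep_x)
  head(selectVar(fit_sparse, comp = 1)$name)
}
```

Because the same samples are used for tuning and for the final fit, any error rate read from such a run is likely to be optimistic, in line with the factors listed earlier.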

Randomization group associated features

mint_res <- list()
mint_res$mint_rand <- mint.plsda(X = t(assays(data_all_f)$loess),
                                 Y = as.factor(data_all_f$randomisation_code),
                                 study = data_all_f$time, ncomp = 5)

Individual plots

par(mar = c(4, 4, .1, .1))

plotIndiv(mint_res$mint_rand, legend = TRUE, title = 'Mint splsda: Temperature randomization',
          subtitle = 'Full data', ellipse = TRUE)

plotIndiv(mint_res$rand.splsda.res, study = 'global', legend = TRUE,
          subtitle = 'Selected features', ellipse = TRUE)
Fig 2: Individual plots indicating sample grouping before (left) and after feature selection (right)

Individual plot for each of the timepoints

plotIndiv(mint_res$rand.splsda.res, study = 'all.partial',  title = 'MINT sPLS-DA', 
          subtitle = c("24h", "48h", "72h"))

Feature level plots

ROC curves

par(mar = c(4, 4, 4, 4))
auroc(mint_res$rand.splsda.res)
## $Comp2
##          AUC   p-value
## 0 vs 1 0.773 1.866e-11
auroc(mint_res$rand.splsda.res, roc.study = "-24-")
## $Comp2
##           AUC   p-value
## 0 vs 1 0.8378 6.863e-07
auroc(mint_res$rand.splsda.res, roc.study = "-48-")
## $Comp2
##           AUC   p-value
## 0 vs 1 0.9502 1.422e-10
auroc(mint_res$rand.splsda.res, roc.study = "-72-")
## $Comp2
##           AUC   p-value
## 0 vs 1 0.9108 3.571e-08
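
To make the AUC values above easier to read, the base R sketch below computes an AUC by hand from made-up prediction scores. The scores are hypothetical, and the link to the p-values printed by `auroc()` rests on our understanding that mixOmics reports a Wilcoxon test on the predicted scores; treat that as an assumption rather than a description of the package internals.

```r
# Hypothetical prediction scores for two classes (illustrative values only).
scores_class1 <- c(0.35, 0.80)   # samples whose true class is "1"
scores_class0 <- c(0.10, 0.40)   # samples whose true class is "0"

# The Wilcoxon rank-sum statistic W counts the (class1, class0) pairs in
# which the class-1 sample has the higher score (ties count one half).
wt <- wilcox.test(scores_class1, scores_class0)

# Dividing W by the number of pairs gives the area under the ROC curve.
auc <- unname(wt$statistic) / (length(scores_class1) * length(scores_class0))

auc          # 0.75: three of the four pairs are ranked correctly
wt$p.value   # rank-sum p-value for the same comparison
```

An AUC of 0.5 corresponds to random ranking and 1 to perfect separation. Since no independent test set is used in this report, the AUCs shown here describe the fit to the training data.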

Heatmap of selected proteins

Correlation circle plot

The correlation circle plot shows how the variables selected across the time points relate to one another and to the two components on which the groups are separated.

plotVar(mint_res$rand.splsda.res)
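
To illustrate how the coordinates in a correlation circle plot are formed, the sketch below uses a swapped-in technique, ordinary PCA via `prcomp()` on simulated data, rather than the MINT sPLS-DA model of this report. The variables `a` to `d` and the latent signal are hypothetical.

```r
# Swapped-in technique: PCA on simulated data, used only to illustrate how
# correlation-circle coordinates are formed (not the MINT model above).
set.seed(42)
n <- 30
latent <- rnorm(n)
X <- cbind(a = latent + rnorm(n, sd = 0.3),
           b = latent + rnorm(n, sd = 0.3),
           c = -latent + rnorm(n, sd = 0.3),
           d = rnorm(n))

pca <- prcomp(X, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:2]   # sample scores on the first two components

# Each variable's coordinates are its correlations with the two components.
coords <- cor(X, scores)
round(coords, 2)

# The component scores are uncorrelated, so every point lies inside the
# unit circle; points close to the circle are well represented in this plane.
radius <- sqrt(rowSums(coords^2))
radius
```

Variables that point in the same direction (`a` and `b`) are positively correlated, and variables that point in opposite directions (`a` and `c`) are negatively correlated.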

Loadings plot

plotLoadings(mint_res$rand.splsda.res, study = "all.partial")

CPC Score

Individual plots

par(mar = c(4, 4, .1, .1))

plotIndiv(mint_res$mint_cpc, legend = TRUE, title = 'Mint splsda: CPC score (dichotomised)',
          subtitle = 'Full data', ellipse = TRUE)

plotIndiv(mint_res$cpc.splsda.res, study = 'global', legend = TRUE,
          subtitle = 'Selected features', ellipse = TRUE)
Fig 2: Individual plots indicating sample grouping before (left) and after feature selection (right)

Individual plot for each of the timepoints

plotIndiv(mint_res$cpc.splsda.res, study = 'all.partial',  title = 'MINT sPLS-DA', 
          subtitle = c("24h", "48h", "72h"))

Feature level plots

ROC curves

par(mar = c(4, 4, 4, 4))
auroc(mint_res$cpc.splsda.res)
## $Comp2
##           AUC p-value
## 0 vs 1 0.9007       0
auroc(mint_res$cpc.splsda.res, roc.study = "-24-")
## $Comp2
##           AUC   p-value
## 0 vs 1 0.8814 3.463e-08
auroc(mint_res$cpc.splsda.res, roc.study = "-48-")
## $Comp2
##           AUC   p-value
## 0 vs 1 0.9299 1.139e-09
auroc(mint_res$cpc.splsda.res, roc.study = "-72-")
## $Comp2
##           AUC   p-value
## 0 vs 1 0.9513 1.587e-09

Heatmap of selected proteins

Correlation circle plot

The correlation circle plot shows how the variables selected across the time points relate to one another and to the two components on which the groups are separated.

plotVar(mint_res$cpc.splsda.res)

Loadings plot

plotLoadings(mint_res$cpc.splsda.res, study = "all.partial")

Shockable vs. non shockable heart

Individual plots

par(mar = c(4, 4, .1, .1))

plotIndiv(mint_res$mint_shock, legend = TRUE, title = 'Mint splsda: shockable vs. non shockable',
          subtitle = 'Full data', ellipse = TRUE)

plotIndiv(mint_res$shock.splsda.res, study = 'global', legend = TRUE,
          subtitle = 'Selected features', ellipse = TRUE)
Fig 2: Individual plots indicating sample grouping before (left) and after feature selection (right)

Individual plot for each of the timepoints

plotIndiv(mint_res$shock.splsda.res, study = 'all.partial',  title = 'MINT sPLS-DA', 
          subtitle = c("24h", "48h", "72h"))

Feature level plots

ROC curves

par(mar = c(4, 4, 4, 4))
auroc(mint_res$shock.splsda.res)
## $Comp2
##           AUC   p-value
## 0 vs 1 0.6607 0.0008015
auroc(mint_res$shock.splsda.res, roc.study = "-24-")
## $Comp2
##          AUC   p-value
## 0 vs 1 0.961 1.728e-09
auroc(mint_res$shock.splsda.res, roc.study = "-48-")
## $Comp2
##           AUC   p-value
## 0 vs 1 0.9205 2.415e-07
auroc(mint_res$shock.splsda.res, roc.study = "-72-")
## $Comp2
##           AUC   p-value
## 0 vs 1 0.9048 4.109e-05

Heatmap of selected proteins

Correlation circle plot

The correlation circle plot shows how the variables selected across the time points relate to one another and to the two components on which the groups are separated.

plotVar(mint_res$shock.splsda.res)

Loadings plot

plotLoadings(mint_res$shock.splsda.res, study = "all.partial")
